[Dlight] Fix general reduction rule to support non-last reduction axis #17754

Merged
jinhongyii merged 1 commit into apache:main from MasterJH5574:tvm-dev/2025-03-16-dlight-sfm
Mar 17, 2025

@MasterJH5574

Contributor

This PR fixes a bug in the general reduction dlight rule. The bug occurs when there is a trailing spatial block and the reduction axes of the preceding reduction blocks are not the last axes.

In this case, the loop orders of the reduction blocks and the trailing spatial block are inconsistent, but before this fix the dlight rule always treated them as consistent.
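To make the pattern concrete, here is a minimal pure-Python sketch of a softmax whose reduction axis sits in the middle of a `[B][R][S]` tensor. The function name `softmax_axis1` and the shape are assumptions chosen for illustration; this is not the TIR function from the unit test, and it does not model how dlight schedules the loops. It only shows two reduction blocks followed by a trailing spatial block where the reduced axis `r` is not the last one.

```python
import math


def softmax_axis1(x):
    """Softmax over axis 1 of a nested list shaped [B][R][S] (illustrative only)."""
    B, R, S = len(x), len(x[0]), len(x[0][0])

    # Reduction block 1: max over r for every spatial position (b, s).
    # The reduced axis r lies between b and s, so it is not the last axis.
    mx = [[max(x[b][r][s] for r in range(R)) for s in range(S)] for b in range(B)]

    # Reduction block 2: sum of exp over r, again for every (b, s).
    sm = [
        [sum(math.exp(x[b][r][s] - mx[b][s]) for r in range(R)) for s in range(S)]
        for b in range(B)
    ]

    # Trailing spatial block: every (b, r, s) is written once, in buffer order.
    return [
        [[math.exp(x[b][r][s] - mx[b][s]) / sm[b][s] for s in range(S)] for r in range(R)]
        for b in range(B)
    ]
```

The reduction blocks naturally group work by `(b, s)`, while the trailing block walks `(b, r, s)`; a schedule that assumes `r` is innermost in both cannot line the two up.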

As a result, although the function produced by the rule is numerically correct, it may use much more shared memory than necessary (in proportion to the size of the spatial loops). When the spatial dimensions are large, the required shared memory size may exceed the device limit.
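As a rough back-of-the-envelope sketch of why this matters, the snippet below models a staging buffer that holds one element per spatial index and compares its size against a per-block shared memory budget. Everything here is an assumption for illustration: the helper name `exceeds_shared_limit`, the one-element-per-spatial-index model, the 4-byte element size, and the 48 KiB budget (a common default on many GPUs, not a figure from this PR).

```python
def exceeds_shared_limit(spatial_extent, dtype_bytes=4, limit_bytes=48 * 1024):
    """Return True if a buffer that scales with the spatial extent overflows the budget.

    Hypothetical model: the buffer stores one element per spatial index,
    so its size grows linearly with the spatial loop extent.
    """
    required_bytes = spatial_extent * dtype_bytes
    return required_bytes > limit_bytes


# A small spatial extent fits comfortably; a large one does not.
small_overflows = exceeds_shared_limit(1024)    # 4 KiB needed
large_overflows = exceeds_shared_limit(32768)   # 128 KiB needed
```

Under this model the requirement grows linearly with the spatial extent, which is consistent with the failure only showing up for large spatial dimensions.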

This PR fixes this bug and adds a unit test.

@MasterJH5574 MasterJH5574 force-pushed the tvm-dev/2025-03-16-dlight-sfm branch from 2c257a6 to 64b7520 Compare March 17, 2025 17:38
@jinhongyii jinhongyii merged commit dafd053 into apache:main Mar 17, 2025
ShiboXing pushed a commit to ShiboXing/tvm that referenced this pull request Aug 10, 2025
apache#17754)

2 participants